3.5 Q6: Exploring the sentiments and event study

3.5.1 What sentiments are prevalent in posts about Dogecoin, and did they change in response to major events?

Elon Musk has been a vocal proponent of Dogecoin, frequently discussing and promoting the cryptocurrency on social media, and his tweets and comments have often coincided with significant fluctuations in its price. His endorsement has helped elevate Dogecoin from a lesser-known digital currency to a prominent player in the cryptocurrency space. In the early hours of November 1, 2022 (just after 12 AM), Musk tweeted a picture of a Shiba Inu wearing a Twitter T-shirt, which likely led to an uptick in Dogecoin’s price. We look at the nature of posts before and after this event. Using sentiment analysis, we calculate an average compound score for the posts, where a higher value indicates more positive sentiment.

Figure 4: Lineplot of change in average sentiment score

We can make a few observations by restricting our view to a window of 14 days before and after November 1, 2022, when the tweet was made. For several days before the tweet, posts containing the word doge on r/CryptoCurrency had virtually stopped. Activity then revived in both subreddits at the time of the tweet, and the average sentiment score was sustained afterwards.
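The windowing and averaging described above can be sketched as follows. This is a minimal illustration, not the code used for Figure 4: it assumes each post already carries a compound sentiment score, and the function name `daily_mean_sentiment`, the tuple layout, and the sample posts are hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

EVENT_DATE = date(2022, 11, 1)  # date of the tweet, as stated in the text
WINDOW_DAYS = 14                # days kept on each side of the event


def daily_mean_sentiment(posts, event_date=EVENT_DATE, window_days=WINDOW_DAYS):
    """Return {(subreddit, day): (mean compound score, post count)}.

    `posts` is an iterable of (subreddit, created, compound) tuples, where
    `created` is a datetime and `compound` is a precomputed sentiment score.
    This tuple layout is an assumption made for the example.
    """
    start = event_date - timedelta(days=window_days)
    end = event_date + timedelta(days=window_days)
    totals = defaultdict(lambda: [0.0, 0])  # running [sum, count] per key
    for subreddit, created, compound in posts:
        day = created.date()
        if start <= day <= end:  # keep only posts inside the event window
            bucket = totals[(subreddit, day)]
            bucket[0] += compound
            bucket[1] += 1
    return {key: (s / n, n) for key, (s, n) in sorted(totals.items())}


# Made-up posts, for illustration only.
posts = [
    ("dogecoin", datetime(2022, 10, 30, 9, 0), 0.25),
    ("dogecoin", datetime(2022, 10, 30, 18, 0), 0.75),
    ("CryptoCurrency", datetime(2022, 11, 1, 0, 30), 0.5),
    ("dogecoin", datetime(2022, 12, 1, 12, 0), 0.9),  # outside the window
]
averages = daily_mean_sentiment(posts)
```

Returning the post count next to the mean keeps both quantities discussed here, activity and sentiment, available for plotting per subreddit and day.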

3.5.2 Do highly active users in both subreddits post distinct content?

In natural language processing, embeddings are high-dimensional vectors that capture the semantic properties of text, transforming textual data into a format that machines can ‘understand’. Using the output of the earlier Spark processing job, we generate embeddings of the post titles with BERT sentence embeddings pretrained on Wikipedia and BooksCorpus and fine-tuned on SST-2.

To aid visual interpretation of these embeddings, we employ the dimensionality reduction technique t-SNE (t-distributed Stochastic Neighbor Embedding). This method projects the high-dimensional data into a 2-dimensional space, making it easier to visualize relationships and clusters within the data, for example whether textual content differs across subreddits.
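A minimal sketch of such a reduction is shown below, assuming scikit-learn's `TSNE` is used (the text does not name the implementation). The random matrix only stands in for the real title embeddings, and the perplexity value is an illustrative choice rather than a setting from the analysis.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for the real title embeddings: 60 random 768-dimensional vectors.
embeddings = rng.normal(size=(60, 768))

# Perplexity must be smaller than the number of samples; 10 is an
# illustrative value, not one taken from the analysis.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
coords = tsne.fit_transform(embeddings)  # one 2-D point per input vector

print(coords.shape)  # (60, 2)
```

Fixing `random_state` makes the layout reproducible for a given input, which helps when comparing plots across runs.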

The embedding step produces a column containing a 768-dimensional vector for each post title. To visualize these vectors, we reduce them to 2 dimensions with t-SNE and plot them, with color indicating subreddit membership. We have three groups of users: those who are only part of r/CryptoCurrency (red), those who are only part of r/dogecoin (blue), and the highly active users who are part of both subreddits (green). As the plot shows, however, the posts of highly active users are not distinctly different in content or meaning from those of users who are active in only one subreddit.
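The three-way grouping used for the colors can be sketched with plain set operations. The helper `label_users` and the author names are hypothetical, and the activity threshold that defines "highly active" is not specified in the text, so it is left out here.

```python
# Colors follow the legend described in the text.
GROUP_COLORS = {
    "CryptoCurrency only": "red",
    "dogecoin only": "blue",
    "both": "green",
}


def label_users(crypto_authors, doge_authors):
    """Map each author to one of the three groups by subreddit membership."""
    crypto, doge = set(crypto_authors), set(doge_authors)
    labels = {}
    for user in crypto | doge:
        if user in crypto and user in doge:
            labels[user] = "both"
        elif user in crypto:
            labels[user] = "CryptoCurrency only"
        else:
            labels[user] = "dogecoin only"
    return labels


# Made-up author names, for illustration only.
labels = label_users(
    crypto_authors=["alice", "bob"],
    doge_authors=["bob", "carol"],
)
colors = {user: GROUP_COLORS[group] for user, group in labels.items()}
```

Each post title can then inherit its author's color when the 2-dimensional points are plotted.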

Figure 5: Scatter plot of t-SNE embeddings of post titles
